Extraction of Lexico-Semantic Classes from Text

Authors

  • Pablo Gamallo
  • Gabriel P. Lopes
  • Alexandre Agustini
Abstract

This paper describes an unsupervised method for extracting lexico-semantic classes from POS-annotated corpora. The method consists in building bi-dimensional clusters of both words and local syntactic contexts. Each cluster, which represents a lexico-semantic class such as “entities in danger”, is the result of merging its most prototypical constituents (words and contexts). The generated clusters will be used as centroids for word classification. The basic intuition underlying our corpus-based approach is that similar classes can be aggregated to generate either more specific or more generic classes, without inducing odd associations between contexts and words. A new class is generated by specification if we take the union of the constituent contexts (intension expansion) while intersecting the words (extension reduction). A new class is generated by abstraction if the local contexts are intersected (intension reduction) while we take the union of the constituent words (extension expansion). Intersecting words and local contexts in an accurate way allows us to generate tight clusters with prototypical constituents.
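
The two aggregation operations named in the abstract can be illustrated with plain set operations. The following is only a minimal sketch, not the authors' implementation: the `LexClass` structure, the function names, and the example words and context labels (such as `threaten_obj`) are all hypothetical, and the similarity test that decides which classes may be merged is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexClass:
    """Hypothetical bi-dimensional cluster: words form the extension,
    local syntactic contexts form the intension."""
    words: frozenset
    contexts: frozenset


def specify(a: LexClass, b: LexClass) -> LexClass:
    # Specification: union of contexts (intension expansion),
    # intersection of words (extension reduction).
    return LexClass(words=a.words & b.words, contexts=a.contexts | b.contexts)


def abstract(a: LexClass, b: LexClass) -> LexClass:
    # Abstraction: intersection of contexts (intension reduction),
    # union of words (extension expansion).
    return LexClass(words=a.words | b.words, contexts=a.contexts & b.contexts)


# Invented toy data, loosely inspired by the "entities in danger" example.
a = LexClass(frozenset({"species", "forest", "child"}),
             frozenset({"protect_obj", "threaten_obj"}))
b = LexClass(frozenset({"species", "forest", "habitat"}),
             frozenset({"threaten_obj", "endanger_obj"}))

specific = specify(a, b)   # fewer words, more contexts
generic = abstract(a, b)   # more words, fewer contexts
```

In this toy case the more specific class keeps only the words shared by both inputs while accumulating all their contexts, and the more generic class does the reverse.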

Similar resources

Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level

This paper explores the incorporation of lexico-semantic heuristics into a deterministic Coreference Resolution (CR) system for classifying named entities at document-level. The most precise sieves of a CR tool are enriched with both a set of heuristics for merging named entities labeled with different classes and also with some constraints that avoid the incorrect merging of similar mention...

Variation and Semantic Relation Interpretation: Linguistic and Processing Issues

Studies in linguistics define lexico-syntactic patterns to characterize the linguistic utterances that can be interpreted with semantic relations. Because patterns are assumed to reflect linguistic regularities that have a stable interpretation, several software tools implement such patterns to extract semantic relations from text. Nevertheless, a thorough analysis of pattern occurrences in various c...

Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns

A challenging problem in open information extraction and text mining is the learning of the selectional restrictions of semantic relations. We propose a minimally supervised bootstrapping algorithm that uses a single seed and a recursive lexico-syntactic pattern to learn the arguments and the supertypes of a diverse set of semantic relations from the Web. We evaluate the performance of our algo...

Extraction of Lexico-Syntactic Information and Acquisition of Causality Schemas for Text Annotation

We present the INSYSE method for the annotation of texts, based on extraction of semantic relations from syntactic structures. We apply this method to a corpus of 5000 Medline abstracts about central nervous system diseases and gene interactions. Our cooperative approach focuses on (1) extracting lexico-syntactic information from sentences in the corpus comprising causation lexemes and (2) elab...

Using Lexico-Syntactic Ontology Design Patterns for Ontology Creation and Population

In this paper we discuss the use of information extraction techniques involving lexico-syntactic patterns to generate ontological information from unstructured text and either create a new ontology from scratch or augment an existing ontology with new entities. We refine the patterns using a term extraction tool and some semantic restrictions derived from WordNet and VerbNet, in order to preven...

Publication date: 2006